Martini: using literature keywords to compare gene sets
Abstract
Life scientists often want to compare two gene sets to gain insight into differences between two distinct but related phenotypes or conditions. Several tools have been developed for comparing gene sets, most of which find Gene Ontology (GO) terms that are significantly over-represented in one gene set. However, such tools often return GO terms that are too generic or too few to be informative. Here, we present Martini, an easy-to-use tool for comparing gene sets. Martini is based not on GO but on keywords extracted from Medline abstracts, and it supports a much wider range of species than comparable tools. To evaluate Martini, we created a benchmark based on the human cell cycle and tested several comparable tools (CoPub, FatiGO, Marmite and ProfCom). Martini had the best benchmark performance, delivering a more detailed and accurate description of function. Martini also gave the best or equal-best performance on three other datasets (related to Arabidopsis, melanoma and ovarian cancer), suggesting that Martini represents an advance in the automated comparison of gene sets. In agreement with previous studies, our results further suggest that literature-derived keywords are a richer source of gene-function information than GO annotations. Martini is freely available at http://martini.embl.de.
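To make the idea of keywords "over-represented in one gene set" concrete, here is a minimal sketch of a keyword enrichment test between two gene sets. It is not Martini's implementation, since the abstract does not describe its statistics. It assumes a one-sided hypergeometric test with no multiple-testing correction. The gene identifiers, keyword annotations, and the function names `hypergeom_upper_tail` and `enriched_keywords` are all hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the keyword."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def enriched_keywords(set_a, set_b, gene_keywords, alpha=0.05):
    """Return (keyword, p-value) pairs over-represented in set_a versus set_b.

    Assumption: the background is the union of both sets, and genes that
    appear in both sets are counted only in set_a.
    """
    set_a = set(set_a)
    set_b = set(set_b) - set_a
    background = set_a | set_b
    N, n = len(background), len(set_a)

    # Collect every keyword attached to at least one background gene.
    keywords = set()
    for gene in background:
        keywords |= gene_keywords.get(gene, set())

    hits = []
    for kw in keywords:
        K = sum(1 for g in background if kw in gene_keywords.get(g, set()))
        k = sum(1 for g in set_a if kw in gene_keywords.get(g, set()))
        p = hypergeom_upper_tail(k, N, K, n)
        if p < alpha:
            hits.append((kw, p))
    # Most significant first; ties broken alphabetically for stable output.
    return sorted(hits, key=lambda item: (item[1], item[0]))


# Hypothetical toy annotations (not real Medline-derived keywords).
toy_keywords = {
    "g1": {"mitosis", "kinase"}, "g2": {"mitosis"},
    "g3": {"mitosis", "kinase"}, "g4": {"mitosis"},
    "g5": {"replication", "kinase"}, "g6": {"replication"},
    "g7": {"replication"}, "g8": {"replication", "kinase"},
}
result = enriched_keywords(["g1", "g2", "g3", "g4"],
                           ["g5", "g6", "g7", "g8"],
                           toy_keywords)
```

In this toy case only "mitosis" falls below the 0.05 threshold for the first set, while "kinase", which is spread evenly across both sets, does not. A real analysis would also correct for testing many keywords at once.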
Journal title:
Volume 38, issue
Pages -
Publication date: 2010